Smart Paper Notes

Cross-Domain Few-Shot Relation Extraction via Representation Learning and Domain Adaptation

Zhongju Yuan , Zhenkun Wang , Genghui Li

Categories: Natural Language Processing | Artificial Intelligence

2022-12-05

Cross-domain few-shot relation extraction poses a great challenge for existing few-shot learning and domain adaptation methods when the source and target domains differ substantially. This paper proposes a method that combines ideas from few-shot learning and domain adaptation to address this problem. In the proposed method, an encoder, learned by optimizing a representation loss and an adversarial loss, extracts the relations of sentences in the source and target domains. The representation loss, which includes a cross-entropy loss and a contrastive loss, makes the encoder extract the relations of the source domain while preserving the geometric structure of the classes in the source domain. The adversarial loss is used to merge the source domain and the target domain. Experimental results on the benchmark FewRel dataset demonstrate that the proposed method can outperform some state-of-the-art methods.

Translated by Google Translate

A Generalized Scalarization Method for Evolutionary Multi-objective Optimization

Ruihao Zheng , Zhenkun Wang

Categories: Neural and Evolutionary Computing

2022-12-03

The decomposition-based multi-objective evolutionary algorithm (MOEA/D) transforms a multi-objective optimization problem (MOP) into a set of single-objective subproblems for collaborative optimization. Mismatches between subproblems and solutions can lead to severe performance degradation of MOEA/D. Most existing mismatch coping strategies only work when the $L_{\infty}$ scalarization is used. A mismatch coping strategy that can use any $L_{p}$ scalarization, even when facing MOPs with non-convex Pareto fronts, is of great significance for MOEA/D. This paper uses global replacement (GR) as the backbone. We analyze how GR can no longer avoid mismatches when $L_{\infty}$ is replaced by another $L_{p}$ with $p\in [1,\infty)$, and find that the $L_p$-based ($1\leq p<\infty$) subproblems have inconsistently large preference regions. When $p$ is set to a small value, some middle subproblems have such small preference regions that their direction vectors cannot pass through their corresponding preference regions. Therefore, we propose a generalized $L_p$ (G$L_p$) scalarization to ensure that each subproblem's direction vector passes through its preference region. Our theoretical analysis shows that GR can always avoid mismatches when using the G$L_p$ scalarization for any $p\geq 1$. The experimental studies on various MOPs are consistent with the theoretical analysis.
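For readers unfamiliar with scalarization, the following is a minimal sketch of one common convention for the weighted $L_p$ scalarization and its $L_{\infty}$ (Tchebycheff) limit. It is not the G$L_p$ scalarization proposed in the paper, whose definition the abstract does not give. The function name `lp_scalarization`, the weight vector `w`, and the ideal point `z_star` are illustrative assumptions.

```python
import math

def lp_scalarization(f, w, z_star, p):
    """Weighted L_p distance between objective vector f and ideal point z_star.

    Assumed convention (not taken from the paper): each term is w_i * |f_i - z_i|.
    p may be any value >= 1, or math.inf for the L_infinity (Tchebycheff) form.
    """
    terms = [wi * abs(fi - zi) for fi, wi, zi in zip(f, w, z_star)]
    if math.isinf(p):
        # L_infinity keeps only the worst weighted deviation.
        return max(terms)
    return sum(t ** p for t in terms) ** (1.0 / p)
```

With `f = [3, 4]`, unit weights, and the origin as ideal point, the value shrinks from 7 at `p = 1` to 5 at `p = 2` and to 4 at `p = math.inf`, which shows how the choice of `p` changes which solutions a subproblem prefers.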

DGI: Easy and Efficient Inference for GNNs

Peiqi Yin , Xiao Yan , Jinjing Zhou , Qiang Fu , Zhenkun Cai , James Cheng , Bo Tang , Minjie Wang

Categories: Machine Learning

2022-11-28

While many systems have been developed to train Graph Neural Networks (GNNs), efficient model inference and evaluation remain to be addressed. For instance, using the widely adopted node-wise approach, model evaluation can account for up to 94% of the time in the end-to-end training process due to neighbor explosion, which means that a node accesses its multi-hop neighbors. On the other hand, layer-wise inference avoids the neighbor explosion problem by conducting inference layer by layer such that the nodes only need their one-hop neighbors in each layer. However, implementing layer-wise inference requires substantial engineering efforts because users need to manually decompose a GNN model into layers for computation and split workload into batches to fit into device memory. In this paper, we develop Deep Graph Inference (DGI) -- a system for easy and efficient GNN model inference, which automatically translates the training code of a GNN model for layer-wise execution. DGI is general for various GNN models and different kinds of inference requests, and supports out-of-core execution on large graphs that cannot fit in CPU memory. Experimental results show that DGI consistently outperforms layer-wise inference across different datasets and hardware settings, and the speedup can be over 1,000x.
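The difference between node-wise and layer-wise inference can be sketched with a toy model. The code below is an illustration under stated assumptions, not DGI's implementation: the "layer" is a plain mean over a node and its neighbors, features are scalars, and the names `node_wise` and `layer_wise` are hypothetical.

```python
def node_wise(graph, feats, node, num_layers):
    """Compute one node's embedding recursively.

    Each extra layer expands the set of visited nodes by one hop,
    which is the neighbor explosion described above.
    """
    if num_layers == 0:
        return feats[node]
    own = node_wise(graph, feats, node, num_layers - 1)
    neigh = [node_wise(graph, feats, n, num_layers - 1) for n in graph[node]]
    return (own + sum(neigh)) / (1 + len(neigh))

def layer_wise(graph, feats, num_layers):
    """Compute all nodes' embeddings one layer at a time.

    Every step reads only one-hop neighbors from the previous layer's output.
    """
    h = dict(feats)
    for _ in range(num_layers):
        h = {v: (h[v] + sum(h[n] for n in graph[v])) / (1 + len(graph[v]))
             for v in graph}
    return h
```

Both functions give the same embeddings. The layer-wise version reuses each intermediate result, while the node-wise version recomputes overlapping neighborhoods for every target node.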

Dynamic Multi-objective Ensemble of Acquisition Functions in Batch Bayesian Optimization

Jixiang Chen , Fu Luo , Zhenkun Wang

Categories: Neural and Evolutionary Computing

2022-06-22

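Bayesian optimization (BO) is a typical approach to solving expensive optimization problems. In each iteration of BO, a Gaussian process (GP) model is trained on the previously evaluated solutions. The next candidate solutions for expensive evaluation are then recommended by maximizing a cheaply evaluated acquisition function on the trained surrogate model. The acquisition function plays a crucial role in the optimization process. However, each acquisition function has its own strengths and weaknesses, and no single acquisition function consistently outperforms the others across all kinds of problems. To better exploit the strengths of different acquisition functions, we propose a new method for batch BO. In each iteration, three acquisition functions are dynamically selected according to their current and historical performance to form a multi-objective optimization problem (MOP). Optimizing this MOP with an evolutionary multi-objective algorithm yields a set of non-dominated solutions. To select the batch of candidate solutions, we rank these non-dominated solutions into several layers according to their relative performance on the three acquisition functions. Empirical results show that the proposed method is competitive with state-of-the-art methods on different problems.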
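The notion of a non-dominated set over three acquisition functions can be sketched as follows. This is a minimal illustration that assumes every acquisition value is to be maximized. It covers only the Pareto filtering step, not the paper's layer-ranking scheme, and the names `dominates` and `non_dominated` are hypothetical.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b on every acquisition
    function and strictly better on at least one (maximization assumed)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def non_dominated(scores):
    """Return the indices of candidates that no other candidate dominates."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s)
                       for j, t in enumerate(scores) if j != i)]
```

For example, given the score vectors `(1, 2, 3)`, `(2, 2, 3)`, `(3, 1, 1)`, and `(0, 0, 0)`, only the second and third candidates are non-dominated: each is best on at least one acquisition function and is not beaten everywhere by any other candidate.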

A Multi-level Neural Network for Implicit Causality Detection in Web Texts

Shining Liang , Wanli Zuo , Zhenkun Shi , Sen Wang , Junhu Wang , Xianglin Zuo

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2019-08-18

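Mining causality from text is a complex and crucial natural language understanding task that corresponds to human cognition. Existing studies on its solution fall into two main categories: feature-engineering-based methods and neural-model-based methods. In this paper, we find that the former have incomplete coverage and inherent errors but provide prior knowledge, while the latter exploit contextual information but are insufficient at causal inference. To address these limitations, we propose a novel causality detection model named MCDN that explicitly models causal reasoning and, moreover, exploits the advantages of both kinds of methods. Specifically, we adopt multi-head self-attention to acquire semantic features at the word level and use the SCRN to infer causality at the segment level. To the best of our knowledge, this is the first time a relation network has been applied to causality tasks. The experimental results show that: 1) the proposed approach achieves outstanding performance on causality detection; 2) further analysis demonstrates the effectiveness and robustness of MCDN.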
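As background for the word-level component, the following is a minimal sketch of scaled dot-product self-attention for a single head. It is not MCDN's implementation: it assumes identity projections (queries, keys, and values all equal the input), whereas a real multi-head layer learns separate projections per head. The names `softmax` and `self_attention` are illustrative.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Single-head self-attention with identity projections (Q = K = V = x).

    x is a list of word vectors; each output vector is a weighted average
    of all word vectors, weighted by scaled dot-product similarity.
    """
    d = len(x[0])
    out = []
    for q in x:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out
```

Each output row mixes information from every word in the sentence, which is how attention captures word-level context.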

Application of Neural Network in the Prediction of NOx Emissions from Degrading Gas Turbine

Zhenkun Zheng , Alan Rezazadeh

Categories: Machine Learning | Machine Learning (Statistics)

2022-09-19

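This paper applies a neural network algorithm to predict the process response (NOx emissions) of degrading natural gas turbines. Nine different process variables, or predictors, are considered in the predictive modeling. We find that a model trained with the neural network algorithm should include recent data in the training and validation sets to account for the effect of system degradation. The R-squared values of the training and validation sets demonstrate the validity of the model. The residual plot shows no clear pattern, indicating that the model is appropriate. The importance ranking of the process variables is presented, and the significance of the process variables is confirmed. The model trained with the neural network algorithm indicates the optimal settings of the process variables for reaching the minimum NOx emissions from the degrading gas turbine system.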
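The R-squared statistic used to validate the model is the standard coefficient of determination. The sketch below computes it for arbitrary observed and predicted values. The function name `r_squared` and the sample numbers in the usage note are illustrative, not data from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1.0 - ss_res / ss_tot
```

A perfect fit gives 1.0, and a model that always predicts the mean of the observations gives 0.0. For instance, `r_squared([1, 2, 3], [2, 2, 2])` is 0.0.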

A Highly Configurable Hardware/Software Stack for DNN Inference Acceleration

Suvadeep Banerjee , Steve Burns , Pasquale Cocchini , Abhijit Davare , Shweta Jain , Desmond Kirkpatrick , Anton Sorokin , Jin Yang , Zhenkun Yang

Categories: Machine Learning

2021-11-29

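This work focuses on an efficient agile design methodology for domain-specific accelerators. We employ feature-by-feature enhancement of a vertical development stack and apply it to the TVM/VTA inference accelerator. We have enhanced the VTA design space and enabled end-to-end support for additional workloads. This was accomplished by augmenting the VTA microarchitecture and instruction set architecture (ISA), and by enhancing the TVM compilation stack to support a wide range of VTA configurations. The VTA TSIM implementation (Chisel-based) has been enhanced with fully pipelined versions of the ALU/GEMM execution units. In TSIM, the memory width can now range from 8 to 64 bytes. Field widths have been made more flexible to support larger scratchpads. New instructions have been added: element-wise 8-bit multiplication to support depthwise convolution, and a load with a choice of pad values to support max pooling. Support for more layers and better double buffering has also been added. Fully pipelining the ALU/GEMM units helps significantly: 4.9x fewer cycles, with minimal area change, when running ResNet-18 under the default configuration. Configurations with a further 11.5x reduction in cycle count can be instantiated at the cost of 12x greater area. Many points on the area-performance Pareto curve are shown, illustrating the balance among execution unit sizing, memory interface width, and scratchpad sizing. Finally, VTA is now able to run MobileNet 1.0 and all layers of ResNets, including the previously disabled pooling and fully connected layers. The TVM/VTA architecture has always featured end-to-end workload evaluation on RTL within minutes. With our modifications, it now offers a much larger number of feasible configurations with a wide range of cost versus performance. All of the capabilities mentioned are available in open-source forks, and a subset of them has already been upstreamed.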
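Why max pooling needs a selectable pad value can be shown with a small software sketch. This is an illustration only and does not model VTA's load instruction: the function `max_pool_1d`, its stride of 1, and the use of -128 (the smallest signed 8-bit value) as the pad value are assumptions.

```python
def max_pool_1d(xs, window, pad, pad_value):
    """1-D max pooling with stride 1 over xs, padded on both sides with pad_value."""
    padded = [pad_value] * pad + list(xs) + [pad_value] * pad
    return [max(padded[i:i + window]) for i in range(len(padded) - window + 1)]
```

With the all-negative input `[-5, -3, -4]`, a window of 2, and one element of padding, zero padding yields `[0, -3, -3, 0]`, so the border outputs are wrong because the pad value wins the maximum. Padding with -128 yields `[-5, -3, -3, -4]`, because the pad value can never exceed a real 8-bit input.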

Cross Modal Transformer via Coordinates Encoding for 3D Object Detection

Junjie Yan , Yingfei Liu , Jianjian Sun , Fan Jia , Shuailin Li , Tiancai Wang , Xiangyu Zhang

Categories: Computer Vision

2023-01-03

In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, while its performance is impressive. CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains robust even when the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.

A Survey On Few-shot Knowledge Graph Completion with Structural and Commonsense Knowledge

Haodi Ma , Daisy Zhe Wang

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2023-01-03

Knowledge graphs (KGs) have served as a key component of various natural language processing applications. Commonsense knowledge graphs (CKGs) are a special type of KG in which entities and relations are composed of free-form text. However, previous work on KG completion and CKG completion suffers from long-tail relations and newly added relations that do not have many known triples for training. In light of this, few-shot KG completion (FKGC), which requires the strengths of graph representation learning and few-shot learning, has been proposed to tackle the problem of limited annotated data. In this paper, we comprehensively survey previous attempts at such tasks in the form of a series of methods and applications. Specifically, we first introduce FKGC challenges, commonly used KGs, and CKGs. Then we systematically categorize and summarize existing work in terms of the type of KG and the methods. Finally, we present applications of FKGC models to prediction tasks in different areas and share our thoughts on future research directions for FKGC.

Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

Yue Han , Jiangning Zhang , Zhucun Xue , Chao Xu , Xintian Shen , Yabiao Wang , Chengjie Wang , Yong Liu , Xiangtai Li

Categories: Computer Vision

2023-01-03

Few-Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots; for example, we boost nAP by +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few-Shot Object Detection. Code and model will be available.
